19 research outputs found

    Street Name Data as a Reflection of Migration and Settlement History

    Street names (odonyms) play an important role not only as descriptors of geographic locations but also due to their sociological and political connotations and commemorative character. Here we analyse street names in Europe and North America extracted from OpenStreetMap, asking in particular to what extent odonyms reflect early European settlements in the New World, i.e., the immigration of German, Austrian and Scandinavian minorities. We observe that old street names of European origin can predominantly be found in rural areas. North American street names indeed recapitulate local and regional settlement histories. The aim of this study is to demonstrate that easily accessible data sets from freely available map data such as street names convey usable information concerning migration patterns and the history of settlements in the case of European immigrants in North America as well as colonial history. We provide a freely available pipeline to analyse this kind of data

    Automated Design of Dynamic Programming Schemes for RNA Folding with Pseudoknots

    Despite being a textbook application of dynamic programming (DP) and routine task in RNA structure analysis, RNA secondary structure prediction remains challenging whenever pseudoknots come into play. To circumvent the NP-hardness of energy minimization in realistic energy models, specialized algorithms have been proposed for restricted conformation classes that capture the most frequently observed configurations. While these methods rely on hand-crafted DP schemes, we generalize and fully automatize the design of DP pseudoknot prediction algorithms. We formalize the problem of designing DP algorithms for an (infinite) class of conformations, modeled by (a finite number of) fatgraphs, and automatically build DP schemes minimizing their algorithmic complexity. We propose an algorithm for the problem, based on the tree-decomposition of a well-chosen representative structure, which we simplify and reinterpret as a DP scheme. The algorithm is fixed-parameter tractable for the tree-width tw of the fatgraph, and its output represents an O(n^{tw+1}) algorithm for predicting the MFE folding of an RNA of length n. Our general framework supports general energy models, partition function computations, recursive substructures and partial folding, and could pave the way for algebraic dynamic programming beyond the context-free case
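For orientation, the "textbook application of dynamic programming" mentioned above can be sketched for the pseudoknot-free case. This is a minimal Nussinov-style base-pair maximization, not the automated scheme the abstract describes: it ignores energies and pseudoknots entirely. The function name `max_base_pairs`, the `min_loop` parameter and the set of allowed pairs are assumptions chosen for illustration.

```python
def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested (pseudoknot-free) base pairs in `seq`.

    Illustrative simplification: counts pairs instead of minimizing energy.
    `min_loop` is the minimum number of unpaired bases enclosed by a pair.
    """
    # Watson-Crick and G-U wobble pairs (an assumption for this sketch).
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = maximum number of pairs within seq[i..j], inclusive.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: position j stays unpaired.
            score = best[i][j - 1]
            # Case 2: position j pairs with some k, splitting the interval.
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]
```

The table has O(n^2) entries and each is filled in O(n) time, giving O(n^3) overall. Allowing pseudoknots breaks the simple interval decomposition used here, which is why the DP schemes discussed in the abstract need more indices.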

    Algebraic Dynamic Programming on Trees

    Where string grammars describe how to generate and parse strings, tree grammars describe how to generate and parse trees. We show how to extend generalized algebraic dynamic programming to tree grammars. The resulting dynamic programming algorithms are efficient and provide the complete feature set available to string grammars, including automatic generation of outside parsers and algebra products for efficient backtracking. The complete parsing infrastructure is available as an embedded domain-specific language in Haskell. In addition to the formal framework, we provide implementations for both tree alignment and tree editing. Both algorithms are in active use in, among others, the area of bioinformatics, where optimization problems on trees are of considerable practical importance. This framework and the accompanying algorithms provide a beneficial starting point for developing complex grammars with tree- and forest-based inputs

    Infrared: a declarative tree decomposition-powered framework for bioinformatics

    Motivation: Many bioinformatics problems can be approached as optimization or controlled sampling tasks, and solved exactly and efficiently using Dynamic Programming (DP). However, such exact methods are typically tailored towards specific settings, complex to develop, and hard to implement and adapt to problem variations. Methods: We introduce the Infrared framework to overcome such hindrances for a large class of problems. Its underlying paradigm is tailored toward problems that can be declaratively formalized as sparse feature networks, a generalization of constraint networks. Classic Boolean constraints specify a search space, consisting of putative solutions whose evaluation is performed through a combination of features. Problems are then solved using generic cluster tree elimination algorithms over a tree decomposition of the feature network. Their overall complexities are linear in the number of variables, and only exponential in the treewidth of the feature network. For sparse feature networks, associated with low to moderate treewidths, these algorithms make it possible to find optimal solutions, or generate controlled samples, with practical empirical efficiency. Results: Implementing these methods, the Infrared software allows Python programmers to rapidly develop exact optimization and sampling applications built on efficient tree decomposition-based processing. Instead of directly coding specialized algorithms, problems are declaratively modeled as sets of variables over finite domains, whose dependencies are captured by constraints and functions. Such models are then automatically solved by generic DP algorithms. To illustrate the applicability of Infrared in bioinformatics and guide new users, we model and discuss variants of bioinformatics applications. We provide reimplementations (and extensions) for methods targeting RNA design, RNA sequence-structure alignment, parsimony-driven inference of ancestral traits in phylogenetic trees/networks, and coding sequence design, and demonstrate multidimensional Boltzmann sampling. Previous work and novel results together demonstrate the practical relevance of the framework, whose complexity is typically equivalent to or better than that of specialized algorithms and implementations. Infrared is available at https://www.lix.polytechnique.fr/~will/Software/Infrared with extensive documentation, including various usage examples and API reference; it can be installed using Conda or from source
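The idea of solving a declarative model (variables over finite domains plus scored features) by elimination can be illustrated with a small sketch. This is plain Python written for illustration and does not use the Infrared API; the function `eliminate`, its argument layout and the toy model are all assumptions. It performs simple variable (bucket) elimination along a given order, a close relative of the cluster tree elimination named in the abstract.

```python
from itertools import product

def eliminate(domains, features, order):
    """Maximize the sum of all features by eliminating variables in `order`.

    domains:  dict mapping variable -> list of allowed values
    features: list of (scope, function) pairs; the function takes the
              values of the scope variables, in scope order
    order:    elimination order that covers every variable
    Returns the optimal total score (hypothetical helper, not Infrared).
    """
    # Tabulate each feature as (scope, table keyed by value tuples).
    factors = []
    for scope, fn in features:
        table = {vals: fn(*vals)
                 for vals in product(*(domains[v] for v in scope))}
        factors.append((tuple(scope), table))

    for var in order:
        # Collect every factor that mentions the variable being eliminated.
        bucket = [f for f in factors if var in f[0]]
        factors = [f for f in factors if var not in f[0]]
        new_scope = tuple(sorted({v for scope, _ in bucket
                                  for v in scope if v != var}))
        new_table = {}
        for vals in product(*(domains[v] for v in new_scope)):
            assignment = dict(zip(new_scope, vals))
            best = float("-inf")
            for x in domains[var]:
                assignment[var] = x
                total = sum(table[tuple(assignment[v] for v in scope)]
                            for scope, table in bucket)
                best = max(best, total)
            new_table[vals] = best
        factors.append((new_scope, new_table))

    # Every remaining factor now has an empty scope.
    return sum(table[()] for _, table in factors)

# Toy model: a chain of four binary variables that rewards neighbours
# taking different values.
domains = {i: [0, 1] for i in range(4)}
features = [((i, i + 1), lambda a, b: 1 if a != b else 0) for i in range(3)]
print(eliminate(domains, features, order=[0, 1, 2, 3]))  # 3
```

Each elimination step enumerates only the variables that share a factor with the eliminated one, so on this chain no intermediate table involves more than two variables at once. That locality is the source of the "linear in the number of variables, exponential in the treewidth" behaviour the abstract describes. A hard constraint could be modeled here as a feature returning negative infinity for forbidden assignments.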

    SMORE: Synteny Modulator of Repetitive Elements

    Several families of multicopy genes, such as transfer ribonucleic acids (tRNAs) and ribosomal RNAs (rRNAs), are subject to concerted evolution, an effect that keeps sequences of paralogous genes effectively identical. Under these circumstances, it is impossible to distinguish orthologs from paralogs on the basis of sequence similarity alone. Synteny, the preservation of relative genomic locations, however, remains informative for the disambiguation of evolutionary relationships in this situation. In this contribution, we describe an automatic pipeline for the evolutionary analysis of such cases that uses genome-wide alignments as a starting point to assign orthology relationships determined by synteny. The evolution of tRNAs in primates as well as the history of the Y RNA family in vertebrates and nematodes are used to showcase the method. The pipeline is freely available